feat: wire TreeSitterChunker into LibScopeLite.index() via preChunked #461
Merged
RobertLD merged 1 commit into fix/export-treesitter-chunker on Mar 19, 2026
Add `preChunked?: string[]` to `IndexDocumentInput` — when provided, `indexDocument` skips the markdown chunker and uses the caller's chunks directly. `LibScopeLite.index()` now checks `doc.language`: if set and supported, it pre-chunks the content with `TreeSitterChunker` and passes the result as `preChunked`. Falls back silently to the text chunker on any error (tree-sitter not installed, parse failure, etc.). Consumers set `language: "cpp"` (or any supported alias) on their `LiteDoc` and get function/class-boundary chunks automatically. Docs updated to note this as the preferred approach over using `TreeSitterChunker` directly. Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
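The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `LiteDocSketch`, `CodeChunkerSketch`, `supports`, and `preChunkIfPossible` are hypothetical names, and the chunker is assumed to be synchronous, which the real `TreeSitterChunker` may not be.

```typescript
// Hypothetical stand-in for LiteDoc; the real type has more fields.
interface LiteDocSketch {
  content: string;
  language?: string;
}

// Hypothetical stand-in for TreeSitterChunker (assumed synchronous here).
interface CodeChunkerSketch {
  supports(language: string): boolean;
  chunk(content: string, language: string): string[];
}

// Returns chunks to pass along as `preChunked`, or undefined when the
// caller should let the regular text chunker handle the document.
function preChunkIfPossible(
  doc: LiteDocSketch,
  chunker: CodeChunkerSketch,
): string[] | undefined {
  // No language, or a language the chunker does not know: use the text chunker.
  if (!doc.language || !chunker.supports(doc.language)) {
    return undefined;
  }
  try {
    return chunker.chunk(doc.content, doc.language);
  } catch {
    // Silent fallback: a missing parser or a parse failure must not fail indexing.
    return undefined;
  }
}

// Toy chunker for illustration only: treats blank lines as boundaries.
const demoChunker: CodeChunkerSketch = {
  supports: (language) => language === "cpp",
  chunk: (content) => content.split("\n\n"),
};

const demoChunks = preChunkIfPossible(
  { content: "int a();\n\nint b();", language: "cpp" },
  demoChunker,
);
console.log(demoChunks ?? "fall back to text chunker");
```

Returning `undefined` rather than throwing keeps the decision in one place: the caller forwards whatever it gets as `preChunked`, and an absent value means the existing text-chunking path runs unchanged.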
RobertLD added a commit that referenced this pull request on Mar 19, 2026
* fix: export TreeSitterChunker and CodeChunk from libscope/lite

  TreeSitterChunker was compiled but not re-exported from the ./lite entry point, making it inaccessible to consumers using the package exports map.

  Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* feat: wire TreeSitterChunker into LibScopeLite.index() via preChunked (#461)

  Add `preChunked?: string[]` to `IndexDocumentInput`: when provided, `indexDocument` skips the markdown chunker and uses the caller's chunks directly.

  `LibScopeLite.index()` now checks `doc.language`: if set and supported, it pre-chunks the content with `TreeSitterChunker` and passes the result as `preChunked`. Falls back silently to the text chunker on any error (tree-sitter not installed, parse failure, etc.).

  Consumers set `language: "cpp"` (or any supported alias) on their `LiteDoc` and get function/class-boundary chunks automatically. Docs updated to note this as the preferred approach over using `TreeSitterChunker` directly.

  Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

* style: fix prettier formatting

  Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>



Summary
- Add `preChunked?: string[]` to `IndexDocumentInput`: when provided, `indexDocument` uses these chunks directly, bypassing the markdown chunker
- `LibScopeLite.index()` now checks `doc.language`: if set and supported by `TreeSitterChunker`, pre-chunks the content at function/class boundaries and passes the result as `preChunked`; falls back silently to the text chunker on any error
- `LiteDoc` docs (`lite.md`, `lite-api.md`) updated to mark setting `language` as the preferred approach over using `TreeSitterChunker` directly
- Tests in `lite.test.ts` and `indexing.test.ts`

Test plan
- `npm run typecheck`: no new errors
- `npm test`: 1488 tests pass (7 new)
- `index()` with supported language → `chunk()` called, results passed as `preChunked`
- `index()` with unsupported/no language → text chunker used, no exception
- `index()` when tree-sitter throws → falls back silently, indexing succeeds
- `indexDocument` with `preChunked` → chunks stored verbatim in DB
- `indexDocument` with empty/undefined `preChunked` → normal text chunking

🤖 Generated with Claude Code
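For reviewers, the `preChunked` behavior listed in the test plan (verbatim use when chunks are supplied, normal text chunking when the field is empty or undefined) might look something like the sketch below. `IndexInputSketch`, `textChunkSketch`, and `resolveChunks` are hypothetical names used only for illustration, and the blank-line splitter is a stand-in for the real markdown chunker.

```typescript
// Hypothetical stand-in for IndexDocumentInput; only the relevant fields.
interface IndexInputSketch {
  content: string;
  preChunked?: string[];
}

// Stand-in for the markdown/text chunker: splits on blank lines and trims.
function textChunkSketch(content: string): string[] {
  return content
    .split(/\n\s*\n/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Picks the chunks that would be stored for a document.
function resolveChunks(input: IndexInputSketch): string[] {
  if (input.preChunked && input.preChunked.length > 0) {
    // Caller-supplied chunks are used verbatim: no trimming, no re-splitting.
    return input.preChunked;
  }
  // Empty or undefined preChunked: normal text chunking.
  return textChunkSketch(input.content);
}

console.log(resolveChunks({ content: "intro\n\nbody", preChunked: ["void f() {}"] }));
console.log(resolveChunks({ content: "intro\n\nbody" }));
```

Treating an empty array the same as `undefined` means a chunker that produces nothing never results in a document with zero stored chunks.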